feat(core): add LiteLLM embedding provider by RheagalFire · Pull Request #809 · basicmachines-co/basic-memory

RheagalFire · 2026-05-08T22:36:42Z

Summary

Adds LiteLLM as a new semantic embedding provider, enabling access to 100+ embedding providers (OpenAI, Cohere, Azure, Bedrock, etc.) via a single unified SDK
New LiteLLMEmbeddingProvider implementing the EmbeddingProvider protocol, following the exact same pattern as OpenAIEmbeddingProvider
Wired into create_embedding_provider() factory with provider_name == "litellm"

Changes

src/basic_memory/repository/litellm_provider.py - new LiteLLMEmbeddingProvider with:
- litellm.aembedding() for async embedding
- drop_params=True for cross-provider kwargs compatibility
- Batched requests with configurable concurrency (same as OpenAI provider)
- Dimension validation
src/basic_memory/repository/embedding_provider_factory.py - added elif provider_name == "litellm" branch
pyproject.toml - added litellm>=1.60.0,<2.0.0 to dependencies
tests/repository/test_litellm_provider.py - 13 unit tests (all passing)

Tests

Unit tests (13/13 passing):

$ pytest tests/repository/test_litellm_provider.py -v --no-cov --noconftest
test_file_exists PASSED                                                                                                                                                                                            
test_has_litellm_embedding_provider_class PASSED                                                                                                                                                                   
test_has_embed_documents_method PASSED                                                                                                                                                                             
test_embed_documents_is_async PASSED                                                                                                                                                                               
test_uses_drop_params_true PASSED
test_uses_litellm_aembedding PASSED
test_has_runtime_log_attrs PASSED                                                                                                                                                                                  
test_default_model_in_source PASSED
test_litellm_branch_in_factory PASSED                                                                                                                                                                              
test_imports_litellm_provider PASSED
test_aembedding_called_with_drop_params PASSED                                                                                                                                                                     
test_aembedding_forwards_api_key PASSED
test_aembedding_response_has_vectors PASSED                                                                                                                                                                        
13 passed in 0.04s

Example usage

# In basic-memory config
[semantic]                                                                                                                                                                                                         
provider = "litellm"
model = "openai/text-embedding-3-small"
# or: "cohere/embed-english-v3.0", "azure/my-deployment", etc.

from basic_memory.repository.litellm_provider import LiteLLMEmbeddingProvider

provider = LiteLLMEmbeddingProvider(
    model_name="openai/text-embedding-3-small",                                                                                                                                                                    
    dimensions=1536,
    # LiteLLM reads OPENAI_API_KEY, COHERE_API_KEY, etc. from env automatically                                                                                                                                    
)                                                                                                                                                                                                                  
                                                                                                                                                                                                                   
vectors = await provider.embed_documents(["hello world", "basic memory"])                                                                                                                                          
query_vec = await provider.embed_query("search term")

See https://docs.litellm.ai/docs/embedding/supported_embedding for all supported embedding models.

Impact

Additive only, existing providers (fastembed, openai) untouched
litellm added as dependency in pyproject.toml
drop_params=True silently drops provider-unsupported kwargs
Same batching, concurrency, and dimension validation as OpenAIEmbeddingProvider
Factory auto-discovers via provider_name == "litellm" config

CLAassistant · 2026-05-08T22:36:48Z

All committers have signed the CLA.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c029eb3b86

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

RheagalFire · 2026-05-08T22:39:30Z

cc @phernandez

phernandez · 2026-05-14T14:59:54Z

Thanks for opening this. I took a careful maintainer pass because this adds a new runtime provider and dependency. The direction is useful, but I do not think we can merge this as-is yet.

Main blockers:

The new tests do not actually exercise LiteLLMEmbeddingProvider.

Most of tests/repository/test_litellm_provider.py parses source text with AST/string checks, and the SDK interaction tests call fake.aembedding() directly instead of importing the provider and calling embed_documents() / embed_query(). I ran:
```
python -m pytest tests/repository/test_litellm_provider.py --cov=basic_memory.repository.litellm_provider --cov-report=term-missing
```
and coverage reported that basic_memory.repository.litellm_provider was never imported; the new provider file stayed at 0% coverage. That means regressions in the actual provider implementation would not fail the test suite.

What I would expect here is closer to the existing OpenAI/FastEmbed provider tests: import LiteLLMEmbeddingProvider, monkeypatch sys.modules["litellm"] with an async aembedding, call the provider methods, and assert batching, output ordering, dimensions, API key forwarding, missing dependency behavior, and malformed response handling through the provider itself.
The default LiteLLM provider config currently creates an invalid model/dimension pairing.

BasicMemoryConfig.semantic_embedding_model defaults to "bge-small-en-v1.5". The new factory branch uses:
```
model_name = app_config.semantic_embedding_model or "openai/text-embedding-3-small"
```
so semantic_embedding_provider="litellm" with otherwise default config creates a LiteLLM provider with model_name="bge-small-en-v1.5" and dimensions=1536. I verified that a 384-dimensional response then fails the provider's dimension check.

This should either map the Basic Memory default model to a valid LiteLLM default, adjust dimensions consistently, or require explicit LiteLLM model/dimensions config with a clear error. The important part is that selecting provider = "litellm" should not create a broken provider by default.
uv.lock was not updated after adding litellm to pyproject.toml.

uv lock --check fails on this branch with:
```
The lockfile at `uv.lock` needs to be updated, but `--check` was provided.
```
Please run uv lock and include the lockfile update if this stays as a direct project dependency.

A smaller design question for maintainers/contributor: adding litellm as a default dependency is a fairly large dependency surface for all Basic Memory installs. That may still be acceptable, but it is worth explicitly confirming whether this should be a core dependency or an optional semantic-provider extra.

phernandez · 2026-05-14T15:00:08Z

@RheagalFire thanks again for contributing this. The overall idea is useful, but there are a few correctness and test-coverage issues we need fixed before we can move it toward merge.

Would you like to take a pass at addressing the review notes above? If so, we are happy to review another commit on this PR. If you would rather not, just say so and we can decide whether someone on the Basic Memory side should pick it up from here.

RheagalFire · 2026-05-14T16:06:40Z

@RheagalFire thanks again for contributing this. The overall idea is useful, but there are a few correctness and test-coverage issues we need fixed before we can move it toward merge.

Would you like to take a pass at addressing the review notes above? If so, we are happy to review another commit on this PR. If you would rather not, just say so and we can decide whether someone on the Basic Memory side should pick it up from here.

Thanks for the review. I'm happy to pick up the changes.

Signed-off-by: RheagalFire <arishalam121@gmail.com>

…ckfile Signed-off-by: RheagalFire <arishalam121@gmail.com>

Signed-off-by: RheagalFire <arishalam121@gmail.com>

RheagalFire · 2026-05-18T21:58:48Z

@phernandez

Addressed all 3 blockers:

Rewrote tests to exercise LiteLLMEmbeddingProvider directly -- 13 tests covering embed_query, embed_documents, batching, api_key forwarding, drop_params, dimension mismatch, missing dependency, output ordering, and factory selection
Fixed default model mapping -- bge-small-en-v1.5 now remaps to openai/text-embedding-3-small in the factory (matching the OpenAI branch pattern)
Also to confirm: uv lock --check now passes cleanly.

On the design question about dependency surface -- you are right, litellm pulls in a sizable transitive set. If you would prefer it as an optional extra rather than a core dependency, I am happy to move it to [project.optional-dependencies] so users install with pip install basic-memory[litellm]. Let me know your preference and I will adjust.

phernandez · 2026-05-26T18:25:36Z

Thanks @RheagalFire — the rewrite addresses all three earlier blockers cleanly, and the factory mapping mirrors the OpenAI branch nicely.

On the dependency-surface question: keep litellm as a core dependency. It aligns with our near-term roadmap where LiteLLM expands from embedding-only to a general LLM provider (BYO key / Ollama / cloud chat completions + provider fallback), so making it optional would just create churn when that work lands.

Before we merge I'd like to add two small things on top of your branch. I'll push a fixup commit so you don't have to context-switch:

L2-normalize the LiteLLM output vectors. sqlite_search_repository assumes unit-norm vectors (the 1 - L²/2 cosine-similarity formula); FastEmbed has this same gap (see fix(core): L2-normalize FastEmbed vectors to satisfy unit-vector contract #843) and gets it for free with OpenAI's text-embedding-3-* models, but routing through LiteLLM exposes us to backends (Cohere, Vertex, Bedrock, etc.) that don't return normalized vectors by default. Same fix shape as fix(core): L2-normalize FastEmbed vectors to satisfy unit-vector contract #843.
Mirror the OpenAI provider's response handling. Switch item["index"] / item["embedding"] to attribute access (item.index / item.embedding) and add the duplicate-index check it already has. Keeps the two providers visually parallel.

Both are small; I'll keep your authorship on the commit history.

Bring the LiteLLM provider in line with the unit-norm contract from sqlite_search_repository.py (lines 65-67): the cosine-similarity formula `1 - L²/2` is correct only for unit-normalized vectors. LiteLLM routes to many backends (Cohere, Vertex, Bedrock, etc.) that do not return normalized embeddings, so normalize at the provider boundary — same fix shape as the parallel FastEmbed change in basicmachines-co#843. Also align the response handling with OpenAIEmbeddingProvider: - attribute access on response items (item.index / item.embedding) - explicit duplicate-index guard Tests cover the three behaviors directly (unit norm, zero-vector pass-through, duplicate-index error) and the existing ordering test now reconstructs the expected normalized vectors so a normalization regression would be caught. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f9e7029ae7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 187ca1a160

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Signed-off-by: phernandez <paul@basicmachines.co>

phernandez · 2026-05-28T23:00:20Z

Added an opt-in live LiteLLM integration check for the provider matrix:

BASIC_MEMORY_RUN_LITELLM_INTEGRATION=1 OPENAI_API_KEY=... COHERE_API_KEY=... uv run pytest test-int/semantic/test_litellm_live_models.py -q

It includes built-in OpenAI and Cohere cases when those keys are present. Additional providers can be supplied with BASIC_MEMORY_TEST_LITELLM_CASES JSON, including document_input_type and query_input_type for asymmetric models. Local verification here exercised the skip path because this environment has no provider API keys set.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2c8975e9bb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

phernandez · 2026-05-28T23:30:32Z

Full base-repo Tests workflow passed for SHA 2c8975e9bb91dae0129a6531a7ae005d2fba10ec: https://github.com/basicmachines-co/basic-memory/actions/runs/26607333542. I triggered it via a temporary base-repo branch so the fork PR got the same full matrix coverage as an in-repo branch.

Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a758657537

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8270405a1c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a3739fa72e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Signed-off-by: phernandez <paul@basicmachines.co>

phernandez · 2026-05-29T02:21:39Z

Validation update for the LiteLLM correctness fixes on de1f9778:

Local targeted tests: uv run pytest tests/repository/test_litellm_provider.py tests/test_config.py::TestSemanticSearchConfig test-int/semantic/test_litellm_live_models.py -q -> 41 passed, 1 skipped.
Local checks: targeted ruff check, targeted ruff format --check, just typecheck, and git diff --check all passed.
Base-repo exact-SHA CI: Tests workflow passed 16/16 jobs for de1f977863a93c9e9c04d546925db1363ee2247a: https://github.com/basicmachines-co/basic-memory/actions/runs/26613190031

The live LiteLLM suite is opt-in so normal CI does not spend external API quota. Built-in live cases run with BASIC_MEMORY_RUN_LITELLM_INTEGRATION=1 plus OPENAI_API_KEY for openai/text-embedding-3-small and COHERE_API_KEY for cohere/embed-english-v3.0. Azure/OpenAI deployment aliases can be covered through BASIC_MEMORY_TEST_LITELLM_CASES with forward_dimensions: true and the Azure env vars LiteLLM expects.

Signed-off-by: phernandez <paul@basicmachines.co>

phernandez · 2026-05-29T03:55:30Z

Added a repeatable LiteLLM live evaluation harness in 750c7c99:

test-int/semantic/litellm_live_harness.py now owns live case parsing, built-in OpenAI/Cohere cases, vector normalization/dimension checks, ranking sanity, latency metrics, table/JSON output, and custom cases files.
test-int/semantic/test_litellm_live_models.py now uses the same harness as the human runner, so pytest and manual validation stay aligned.
just test-litellm-live runs the opt-in harness with BASIC_MEMORY_RUN_LITELLM_INTEGRATION=1; docs now list the required keys and include Azure/NVIDIA cases-file examples.

Local verification on the new commit:

uv run pytest tests/repository/test_litellm_provider.py tests/test_config.py::TestSemanticSearchConfig test-int/semantic/test_litellm_live_harness.py test-int/semantic/test_litellm_live_models.py -q -> 46 passed, 1 skipped.
just test-litellm-live --cases-file <tmp missing-key case> --json correctly reports the missing env var without making a network call.
Targeted ruff check, targeted ruff format --check, just typecheck, and git diff --check all passed.
Base-repo exact-SHA Tests workflow passed 16/16 jobs for 750c7c9920bb9d890b8c6cef1c9dd9902260502c: https://github.com/basicmachines-co/basic-memory/actions/runs/26616059046

Live provider keys for manual evaluation:

Minimum: OPENAI_API_KEY and COHERE_API_KEY.
Azure alias coverage: AZURE_API_KEY, AZURE_API_BASE, AZURE_API_VERSION, plus a deployment name in a cases file with forward_dimensions: true.
Optional NVIDIA NIM coverage: NVIDIA_NIM_API_KEY and a cases file using document_input_type: "passage" and query_input_type: "query".

RheagalFire force-pushed the feat/add-litellm-provider branch from c029eb3 to 849f9f5 Compare May 8, 2026 22:37

chatgpt-codex-connector Bot reviewed May 8, 2026

View reviewed changes

Comment thread src/basic_memory/repository/embedding_provider_factory.py

phernandez added the On Hold Don't review or merge. Work is pending label May 14, 2026

RheagalFire force-pushed the feat/add-litellm-provider branch from b106193 to d6100c7 Compare May 18, 2026 21:15

RheagalFire added 2 commits May 19, 2026 02:46

feat: add LiteLLM as embedding provider

be7cd78

Signed-off-by: RheagalFire <arishalam121@gmail.com>

fix: rewrite tests to exercise provider, fix default model, update lo…

7d068f5

…ckfile Signed-off-by: RheagalFire <arishalam121@gmail.com>

RheagalFire force-pushed the feat/add-litellm-provider branch from d6100c7 to 7d068f5 Compare May 18, 2026 21:16

fix: minimal uv.lock update for litellm deps only

fce7a67

Signed-off-by: RheagalFire <arishalam121@gmail.com>

chatgpt-codex-connector Bot reviewed May 26, 2026

View reviewed changes

Comment thread src/basic_memory/repository/litellm_provider.py

phernandez removed the On Hold Don't review or merge. Work is pending label May 28, 2026

phernandez changed the title ~~feat: add LiteLLM as embedding provider~~ feat(core): add LiteLLM embedding provider May 28, 2026

chore: trigger CI for LiteLLM provider

187ca1a

Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector Bot reviewed May 28, 2026

View reviewed changes

Comment thread src/basic_memory/repository/litellm_provider.py Outdated

fix(core): split LiteLLM query and document embeddings

2c8975e

Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector Bot reviewed May 28, 2026

View reviewed changes

Comment thread src/basic_memory/repository/litellm_provider.py Outdated

fix(core): harden LiteLLM provider configuration

a758657

Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector Bot reviewed May 28, 2026

View reviewed changes

Comment thread src/basic_memory/repository/litellm_provider.py

fix(core): forward LiteLLM embedding dimensions

8270405

Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

Comment thread src/basic_memory/repository/litellm_provider.py Outdated

fix(core): scope LiteLLM dimension requests

a3739fa

Signed-off-by: phernandez <paul@basicmachines.co>

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

Comment thread src/basic_memory/repository/litellm_provider.py

fix(core): add LiteLLM dimension forwarding override

de1f977

Signed-off-by: phernandez <paul@basicmachines.co>

test(core): add LiteLLM live evaluation harness

750c7c9

Signed-off-by: phernandez <paul@basicmachines.co>

Conversation

RheagalFire commented May 8, 2026

Summary

Changes

Tests

Example usage

Impact

Uh oh!

CLAassistant commented May 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

RheagalFire commented May 8, 2026

Uh oh!

phernandez commented May 14, 2026

Uh oh!

phernandez commented May 14, 2026

Uh oh!

RheagalFire commented May 14, 2026

Uh oh!

RheagalFire commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

phernandez commented May 26, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

phernandez commented May 28, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

phernandez commented May 28, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

phernandez commented May 29, 2026

Uh oh!

phernandez commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

CLAassistant commented May 8, 2026 •

edited

Loading

RheagalFire commented May 18, 2026 •

edited

Loading